HMM-regularization for NMF-based Noise Robust ASR

Authors

  • Jort F. Gemmeke
  • Antti Hurmalainen
  • Tuomas Virtanen
Abstract

In this work we extend a previously proposed NMF-based technique for enhancing noisy speech so that it exploits a Hidden Markov Model (HMM). The NMF-based technique works by finding a sparse representation of spectrogram segments of noisy speech in a dictionary containing both speech and noise exemplars, and uses the activated dictionary atoms to create a time-varying filter that enhances the noisy speech. In order to take a larger temporal context into account and to constrain the representation by the grammar of a speech recognizer, we propose to regularize the optimization problem by additionally minimizing the distance between state emission probabilities derived from the speech exemplar activations and a posteriori state probabilities derived by applying the Forward-Backward algorithm to the emission probabilities. Experiments on Track 1 of the 2nd CHiME Challenge, which contains small-vocabulary speech corrupted by both reverberation and authentic living-room noise at SNRs ranging from 9 to -6 dB, confirm the validity of the proposed technique.
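The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a fixed dictionary `D` whose first `n_speech` columns are speech atoms and whose remaining columns are noise atoms, uses the standard multiplicative update for the generalized KL divergence with an L1 sparsity penalty, and builds a Wiener-style ratio mask. The function names, the `labels` matrix that maps speech atoms to HMM states, and the KL-type penalty are all hypothetical choices made for illustration; the abstract specifies neither the distance measure nor how the penalty enters the optimization.

```python
import numpy as np

def sparse_activations(V, D, sparsity=0.1, n_iter=200, eps=1e-12):
    """Non-negative activations H with V ~= D @ H for a fixed dictionary D.

    Multiplicative updates for generalized KL divergence plus an L1 penalty.
    """
    rng = np.random.default_rng(0)
    H = rng.random((D.shape[1], V.shape[1])) + eps
    col_sums = D.sum(axis=0)[:, None]          # D^T 1, shape (atoms, 1)
    for _ in range(n_iter):
        ratio = V / (D @ H + eps)
        H *= (D.T @ ratio) / (col_sums + sparsity)
    return H

def enhancement_filter(D, H, n_speech, eps=1e-12):
    """Time-varying ratio mask: speech estimate / (speech + noise estimate)."""
    speech = D[:, :n_speech] @ H[:n_speech]
    noise = D[:, n_speech:] @ H[n_speech:]
    return speech / (speech + noise + eps)

def emissions_from_activations(H_speech, labels, eps=1e-12):
    """Assumed mapping: labels[s, k] = 1 if speech atom k belongs to state s."""
    scores = labels @ H_speech + eps
    return scores / scores.sum(axis=0, keepdims=True)

def forward_backward(B, A, pi):
    """Scaled Forward-Backward. B: (states, frames) emission likelihoods,
    A[i, j] = P(state j at t | state i at t-1), pi: initial distribution."""
    S, T = B.shape
    alpha, beta, c = np.zeros((S, T)), np.ones((S, T)), np.zeros(T)
    alpha[:, 0] = pi * B[:, 0]
    c[0] = alpha[:, 0].sum()
    alpha[:, 0] /= c[0]
    for t in range(1, T):
        alpha[:, t] = (A.T @ alpha[:, t - 1]) * B[:, t]
        c[t] = alpha[:, t].sum()
        alpha[:, t] /= c[t]
    for t in range(T - 2, -1, -1):
        beta[:, t] = A @ (B[:, t + 1] * beta[:, t + 1]) / c[t + 1]
    gamma = alpha * beta
    return gamma / gamma.sum(axis=0, keepdims=True)

def hmm_penalty(emissions, posteriors, eps=1e-12):
    """Generalized KL distance between emissions and state posteriors
    (an assumed choice of distance, used here only as an example)."""
    p, q = emissions + eps, posteriors + eps
    return float(np.sum(p * np.log(p / q) - p + q))
```

In this sketch an enhanced magnitude spectrogram would be `enhancement_filter(D, H, n_speech) * V`, and `hmm_penalty` is only evaluated, not minimized; a regularized variant would add it to the NMF cost with a weight and re-derive the activation update accordingly.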

Similar resources

Noise robust ASR in reverberated multisource environments applying convolutive NMF and Long Short-Term Memory

This article proposes and evaluates various methods to integrate the concept of bidirectional Long Short-Term Memory (BLSTM) temporal context modeling into a system for automatic speech recognition (ASR) in noisy and reverberated environments. Building on recent advances in Long Short-Term Memory architectures for ASR, we design a novel front-end for context-sensitive Tandem feature extraction a...

Voice quality normalization in an utterance for robust ASR

In this paper, we propose a novel method of normalizing the voice quality in an utterance for both clean speech and speech contaminated by noise. The normalization method is applied to the N-best hypotheses from an HMM-based classifier, then an SM (Sub-space Method)-based verifier tests the hypotheses after normalizing the monophone scores together with the HMM-based likelihood score. The HMM-SM...

HMM2 - extraction of formant structures and their use for robust ASR

As recently introduced in [1], an HMM2 can be considered as a particular case of an HMM mixture in which the HMM emission probabilities (usually estimated through Gaussian mixtures or an artificial neural network) are modeled by state-dependent, feature-based HMMs (referred to as frequency HMMs). A general EM training algorithm for such a structure has been developed [2]. Although there are numero...

Investigating Modulation Spectrum Factorization Techniques for Robust Speech Recognition

The performance of an automatic speech recognition (ASR) system often deteriorates sharply due to interference from varying environmental noise. As such, the development of effective and efficient robustness techniques has long been a challenging research subject in the ASR community. In this article, we attempt to obtain noise-robust speech features through modulation spectrum processing o...

Projective robust nonnegative factorization

Nonnegative matrix factorization (NMF) has been successfully used in many fields as a low-dimensional representation method. Projective nonnegative matrix factorization (PNMF) is a variant of NMF that was proposed to learn a subspace for feature extraction. However, both original NMF and PNMF are sensitive to noise and are unsuitable for feature extraction if data is grossly corrupted. In order...

Publication date: 2013